Candidate re-ranking for SMT-based grammatical error correction
Abstract
We develop a supervised ranking model to rerank candidates generated from an SMT-based grammatical error correction (GEC) system. A range of novel features with respect to GEC are investigated and implemented in our reranker. We train a rank preference SVM model and demonstrate that this outperforms both Minimum Bayes-Risk and Multi-Engine Machine Translation based re-ranking for the GEC task. Our best system yields a significant improvement in I-measure when testing on the publicly available FCE test set (from 2.87% to 9.78%). It also achieves an F0.5 score of 38.08% on the CoNLL-2014 shared task test set, which is higher than the best original result. The oracle score (upper bound) for the re-ranker achieves over 40% I-measure performance, demonstrating that there is considerable room for improvement in the re-ranking component developed here, such as incorporating features able to capture long-distance dependencies.
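The pairwise idea behind a rank preference SVM can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each candidate correction is represented by a small hypothetical feature vector and a quality score (for example a sentence-level metric against the reference), and it trains a linear scorer with a hinge loss on candidate pairs using plain stochastic gradient descent rather than a dedicated SVM solver. All function names, features, and hyperparameters here are assumptions made for illustration.

```python
def pairwise_differences(candidates):
    """Build preference pairs for one source sentence.

    candidates: list of (feature_vector, quality_score) tuples.
    Returns x_i - x_j for every pair where candidate i is strictly better than j.
    """
    diffs = []
    for xi, yi in candidates:
        for xj, yj in candidates:
            if yi > yj:
                diffs.append([a - b for a, b in zip(xi, xj)])
    return diffs


def train_pairwise_ranker(groups, dim, epochs=50, lr=0.1, reg=0.01):
    """Learn weights w so that w . (x_better - x_worse) >= 1 where possible.

    groups: one candidate list per source sentence (pairs never cross sentences).
    Hypothetical hyperparameters: epochs, lr (step size), reg (L2 strength).
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for candidates in groups:
            for d in pairwise_differences(candidates):
                margin = sum(wk * dk for wk, dk in zip(w, d))
                # L2 regularisation: shrink the weights slightly on every step.
                w = [wk * (1.0 - lr * reg) for wk in w]
                # Hinge loss: update only when the pair violates the margin.
                if margin < 1.0:
                    w = [wk + lr * dk for wk, dk in zip(w, d)]
    return w


def rerank(w, candidate_features):
    """Return the index of the highest-scoring candidate under weights w."""
    def score(x):
        return sum(wk * xk for wk, xk in zip(w, x))
    return max(range(len(candidate_features)), key=lambda i: score(candidate_features[i]))


# Toy data (invented): feature 0 tends to be high for good candidates,
# feature 1 for poor ones.
toy_groups = [
    [([1.0, 0.0], 0.9), ([0.0, 1.0], 0.1)],
    [([0.8, 0.2], 0.7), ([0.3, 0.9], 0.2)],
]
weights = train_pairwise_ranker(toy_groups, dim=2)
best = rerank(weights, [[0.2, 0.8], [0.9, 0.1]])
```

At test time the learned weights replace the decoder's own score when picking one candidate from the n-best list. Pairs are formed only within the candidates for a single source sentence, since scores for different sentences are not comparable.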
Related resources
Grammatical Error Correction
Grammatical error correction (GEC) is the task of automatically correcting grammatical errors in written text. Earlier attempts to grammatical error correction involve rule-based and classifier approaches which are limited to correcting only some particular type of errors in a sentence. As sentences may contain multiple errors of different types, a practical error correction system should be ab...
Discriminative Reranking for Grammatical Error Correction with Statistical Machine Translation
Research on grammatical error correction has received considerable attention. For dealing with all types of errors, grammatical error correction methods that employ statistical machine translation (SMT) have been proposed in recent years. An SMT system generates candidates with scores for all candidates and selects the sentence with the highest score as the correction result. However, the 1-bes...
Grammatical error correction using hybrid systems and type filtering
This paper describes our submission to the CoNLL 2014 shared task on grammatical error correction using a hybrid approach, which includes both a rule-based and an SMT system augmented by a large webbased language model. Furthermore, we demonstrate that correction type estimation can be used to remove unnecessary corrections, improving precision without harming recall. Our best hybrid system ach...
Exploiting N-Best Hypotheses to Improve an SMT Approach to Grammatical Error Correction
Grammatical error correction (GEC) is the task of detecting and correcting grammatical errors in texts written by second language learners. The statistical machine translation (SMT) approach to GEC, in which sentences written by second language learners are translated to grammatically correct sentences, has achieved state-of-the-art accuracy. However, the SMT approach is unable to utilize globa...
The Effect of Learner Corpus Size in Grammatical Error Correction of ESL Writings
English as a Second Language (ESL) learners’ writings contain various grammatical errors. Previous research on automatic error correction for ESL learners’ grammatical errors deals with restricted types of learners’ errors. Some types of errors can be corrected by rules using heuristics, while others are difficult to correct without statistical models using native corpora and/or learner corpora...
Publication date: 2016